45 research outputs found

    Mapping liver fat female-dependent quantitative trait loci in collaborative cross mice

    Non-alcoholic fatty liver disease (NAFLD) is the most common cause of chronic liver disease in the western world, with a spectrum ranging from simple steatosis to non-alcoholic steatohepatitis, which can progress to cirrhosis. NAFLD development is known to be affected by host genetic background. Herein we emphasize the power of collaborative cross (CC) mice for dissecting this complex trait and revealing quantitative trait loci (QTL) controlling hepatic fat accumulation in mice. We studied 168 female and 338 male mice from 24 and 37 CC lines, respectively, aged 18-20 weeks and maintained on a standard rodent diet since weaning. Hepatic fat content was assessed using a dual-energy X-ray absorptiometry (DEXA) scan of the liver. Using the available high-density genotype markers of the CC lines, QTL mapping for percentage liver fat accumulation was performed. Our results revealed significant fatty liver accumulation QTL that mapped specifically in females. Two significant QTL, on chromosomes 17 and 18, were mapped with genomic intervals of 3 and 2 Mb, respectively. A third QTL, with a less significant P value, was mapped to chromosome 4, with a genomic interval of 2 Mb. These QTL were named Flal1-Flal3, referring to Fatty Liver Accumulation Locus 1-3, for the QTL on chromosomes 17, 18, and 4, respectively. No QTL was mapped in males. Searching the mouse genome database suggested several candidate genes involved in hepatic fat accumulation. Our results show that susceptibility to hepatic fat accumulation is a complex trait, controlled by multiple genetic factors in female mice, but not in males

    Evaluation of clustering algorithms for gene expression data

    BACKGROUND: Cluster analysis is an integral part of high-dimensional data analysis. In the context of large-scale gene expression data, a filtered set of genes is grouped according to expression profiles using one of the numerous clustering algorithms in the statistics and machine learning literature. A closely related problem is that of selecting, from this long list of available algorithms, a clustering algorithm that is "optimal" in some sense. RESULTS: In this paper, we propose two validation measures, each with two parts: one measuring the statistical consistency (stability) of the clusters produced and the other representing their biological functional congruence. Smaller values of these indices indicate better performance for a clustering algorithm. We illustrate this approach using two case studies with publicly available gene expression data sets: one involving SAGE data from breast cancer patients and the other involving time-course cDNA microarray data on yeast. Six well-known clustering algorithms (UPGMA, K-Means, Diana, Fanny, Model-Based and SOM) were evaluated. CONCLUSION: No single clustering algorithm may be best suited for clustering genes into functional groups via expression profiles for all data sets. The validation measures introduced in this paper can aid in the selection of an optimal algorithm, for a given data set, from a collection of available clustering algorithms

    Deregulation upon DNA damage revealed by joint analysis of context-specific perturbation data

    BACKGROUND: Deregulation between two different cell populations manifests itself in changing gene expression patterns and changing regulatory interactions. Accumulating knowledge about biological networks creates an opportunity to study these changes in their cellular context. RESULTS: We analyze re-wiring of regulatory networks based on cell population-specific perturbation data and knowledge about signaling pathways and their target genes. We quantify deregulation by merging regulatory signal from the two cell populations into one score. This joint approach, called JODA, proves advantageous over separate analysis of the cell populations and analysis without incorporation of knowledge. JODA is implemented and freely available in a Bioconductor package 'joda'. CONCLUSIONS: Using JODA, we show wide-spread re-wiring of gene regulatory networks upon neocarzinostatin-induced DNA damage in human cells. We recover 645 deregulated genes in thirteen functional clusters performing the rich program of response to damage. We find that the clusters contain many previously characterized neocarzinostatin target genes. We investigate connectivity between those genes, explaining their cooperation in performing the common functions. We review genes with the most extreme deregulation scores, reporting their involvement in response to DNA damage. Finally, we investigate the indirect impact of the ATM pathway on the deregulated genes, and build a hypothetical hierarchy of direct regulation. These results prove that JODA is a step forward to a systems level, mechanistic understanding of changes in gene regulation between different cell populations

    Classification of microarray data using gene networks

    BACKGROUND: Microarrays have become extremely useful for analysing genetic phenomena, but establishing a relation between microarray analysis results (typically a list of genes) and their biological significance is often difficult. Currently, the standard approach is to map a posteriori the results onto gene networks in order to elucidate the functions perturbed at the level of pathways. However, integrating a priori knowledge of the gene networks could help in the statistical analysis of gene expression data and in their biological interpretation. RESULTS: We propose a method to integrate a priori the knowledge of a gene network in the analysis of gene expression data. The approach is based on the spectral decomposition of gene expression profiles with respect to the eigenfunctions of the graph, resulting in an attenuation of the high-frequency components of the expression profiles with respect to the topology of the graph. We show how to derive unsupervised and supervised classification algorithms of expression profiles, resulting in classifiers with biological relevance. We illustrate the method with the analysis of a set of expression profiles from irradiated and non-irradiated yeast strains. CONCLUSION: Including a priori knowledge of a gene network for the analysis of gene expression data leads to good classification performance and improved interpretability of the results
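
The spectral step described in this abstract can be illustrated with a minimal sketch. It assumes an undirected gene network given as a symmetric adjacency matrix, uses the eigenvectors of the combinatorial graph Laplacian as the "eigenfunctions of the graph", and attenuates high-frequency components with an exponential filter. The function name `smooth_on_graph`, the parameter `beta`, and the choice of exponential filter are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def smooth_on_graph(expr, adjacency, beta=1.0):
    """Attenuate high-frequency components of expression profiles on a graph.

    expr      : array of shape (n_genes,) or (n_genes, n_samples)
    adjacency : symmetric (n_genes, n_genes) adjacency matrix of the network
    beta      : hypothetical decay rate; larger values smooth more strongly
    """
    adjacency = np.asarray(adjacency, dtype=float)
    expr = np.asarray(expr, dtype=float)

    # Combinatorial graph Laplacian L = D - A.
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency

    # Eigenvectors of L play the role of Fourier modes on the graph;
    # small eigenvalues correspond to profiles that vary slowly over edges.
    eigvals, eigvecs = np.linalg.eigh(laplacian)

    # Exponential low-pass filter: weight exp(-beta * lambda) per mode.
    weights = np.exp(-beta * eigvals)

    # K = U diag(w) U^T, applied to the expression profiles.
    kernel = (eigvecs * weights) @ eigvecs.T
    return kernel @ expr


# Usage: a three-gene path network with an alternating (high-frequency) profile.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
profile = np.array([1.0, -1.0, 1.0])
smoothed = smooth_on_graph(profile, path, beta=1.0)
```

The smoothed profiles could then be passed to any standard classifier; a profile that is constant over a connected network is left unchanged, since it lies entirely in the zero-eigenvalue mode.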

    Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes

    BACKGROUND: Cluster analysis is the most commonly performed procedure (often regarded as a first step) on a set of gene expression profiles. In most cases, a post hoc analysis is done to see if the genes in the same clusters can be functionally correlated. While successes of such analyses have been reported in a number of microarray studies (most of which used the standard hierarchical clustering, UPGMA, with one minus Pearson's correlation coefficient as a measure of dissimilarity), such groupings can often be misleading. More importantly, a systematic evaluation of the entire set of clusters produced by such unsupervised procedures is necessary, since they also contain genes that are seemingly unrelated or may have more than one common function. Here we quantify the performance of a given unsupervised clustering algorithm applied to a given microarray study in terms of its ability to produce biologically meaningful clusters using a reference set of functional classes. Such a reference set may come from prior biological knowledge specific to a microarray study or may be formed using the growing databases of gene ontologies (GO) for the annotated genes of the relevant species. RESULTS: In this paper, we introduce two performance measures for evaluating the results of a clustering algorithm in its ability to produce biologically meaningful clusters. The first measure is a biological homogeneity index (BHI). As the name suggests, it is a measure of how biologically homogeneous the clusters are. This can be used to quantify the performance of a given clustering algorithm such as UPGMA in grouping genes for a particular data set and also for comparing the performance of a number of competing clustering algorithms applied to the same data set. The second performance measure is called a biological stability index (BSI). For a given clustering algorithm and an expression data set, it measures the consistency of the clustering algorithm's ability to produce biologically meaningful clusters when applied repeatedly to similar data sets. A good clustering algorithm should have high BHI and moderate to high BSI. We evaluated the performance of ten well-known clustering algorithms on two gene expression data sets and identified the optimal algorithm in each case. The first data set deals with SAGE profiles of differentially expressed tags between normal and ductal carcinoma in situ samples of breast cancer patients. The second data set contains the expression profiles over time of positively expressed genes (ORFs) during sporulation of budding yeast. Two separate choices of the functional classes were used for this data set and the results were compared for consistency. CONCLUSION: Functional information of annotated genes available from various GO databases mined using ontology tools can be used to systematically judge the results of an unsupervised clustering algorithm as applied to a gene expression data set in clustering genes. This information could be used to select the right algorithm from a class of clustering algorithms for the given data set
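
As a rough illustration of the homogeneity idea, the sketch below scores each cluster by the fraction of annotated gene pairs that share at least one functional class and averages these fractions over clusters. This is one plausible reading of "biologically homogeneous", not the exact definition from the paper; the function name, the input layout, and the decision to skip clusters with fewer than two annotated genes are all assumptions.

```python
from itertools import combinations


def biological_homogeneity_index(clusters, annotations):
    """Average within-cluster fraction of gene pairs sharing a functional class.

    clusters    : dict mapping gene -> cluster label
    annotations : dict mapping gene -> set of functional classes
                  (unannotated genes may be missing or map to an empty set)
    """
    # Group annotated genes by cluster; unannotated genes are ignored.
    by_cluster = {}
    for gene, label in clusters.items():
        if annotations.get(gene):
            by_cluster.setdefault(label, []).append(gene)

    scores = []
    for genes in by_cluster.values():
        pairs = list(combinations(genes, 2))
        if not pairs:
            # Assumption: clusters with fewer than two annotated genes are skipped.
            continue
        shared = sum(1 for a, b in pairs if annotations[a] & annotations[b])
        scores.append(shared / len(pairs))

    return sum(scores) / len(scores) if scores else 0.0


# Usage with hypothetical genes and functional classes.
clusters = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1}
annotations = {"g1": {"A"}, "g2": {"A"}, "g3": {"B"}, "g4": {"C"}, "g5": {"C"}}
bhi = biological_homogeneity_index(clusters, annotations)
```

In this toy input, cluster 0 has one matching pair out of three and cluster 1 has one out of one, so the index is the mean of 1/3 and 1. A stability counterpart in the spirit of the BSI would repeat the clustering on perturbed versions of the data and compare the resulting functional groupings.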

    Integrating Phosphorylation Network with Transcriptional Network Reveals Novel Functional Relationships

    Phosphorylation and transcriptional regulation events are critical for cells to transmit and respond to signals. In spite of their importance, systems-level strategies that couple these two networks have yet to be presented. Here we introduce a novel approach that integrates the physical and functional aspects of the phosphorylation network with the transcription network in S. cerevisiae, and demonstrate that different network motifs are involved in these networks, which should be considered when interpreting and integrating large-scale datasets. Based on this understanding, we introduce the HeRS score (hetero-regulatory similarity score) to systematically characterize the functional relevance of kinase/phosphatase involvement with transcription factors, and present an algorithm that predicts hetero-regulatory modules. When extended to the signaling network, this approach confirmed the structure and cross talk of MAPK pathways, inferred a novel functional transcription factor, Sok2, in the high osmolarity glycerol pathway, and explained the mechanism of reduced mating efficiency upon Fus3 deletion. This strategy is applicable to other organisms as large-scale datasets become available, providing a means to identify the functional relationships between kinases/phosphatases and transcription factors

    EZH2 promotes a bi-lineage identity in basal-like breast cancer cells

    The mechanisms regulating breast cancer differentiation state are poorly understood. Of particular interest are molecular regulators controlling the highly aggressive and poorly differentiated traits of basal-like breast carcinomas. Here we show that the Polycomb factor EZH2 maintains the differentiation state of basal-like breast cancer cells, and promotes the expression of progenitor-associated and basal-lineage genes. Specifically, EZH2 regulates the composition of basal-like breast cancer cell populations by promoting a ‘bi-lineage’ differentiation state, in which cells co-express basal- and luminal-lineage markers. We show that human basal-like breast cancers contain a subpopulation of bi-lineage cells, and that EZH2-deficient cells give rise to tumors with a decreased proportion of such cells. Bi-lineage cells express genes that are active in normal luminal progenitors, and possess increased colony-formation capacity, consistent with a primitive differentiation state. We found that GATA3, a driver of luminal differentiation, performs a function opposite to EZH2, acting to suppress bi-lineage identity and luminal-progenitor gene expression. GATA3 levels increase upon EZH2 silencing, mediating a decrease in bi-lineage cell numbers. Our findings reveal a novel role for EZH2 in controlling basal-like breast cancer differentiation state and intra-tumoral cell composition

    Deterministic Effects Propagation Networks for reconstructing protein signaling networks from multiple interventions

    BACKGROUND: Modern gene perturbation techniques, like RNA interference (RNAi), enable us to study the effects of targeted interventions in cells efficiently. In combination with mRNA or protein expression data, this allows insights to be gained into the behavior of complex biological systems. RESULTS: In this paper, we propose Deterministic Effects Propagation Networks (DEPNs) as a special Bayesian Network approach to reverse engineer signaling networks from a combination of protein expression and perturbation data. DEPNs allow reconstruction of protein networks based on combinatorial intervention effects, which are monitored via changes of the protein expression or activation over one or a few time points. Our implementation of DEPNs allows for latent network nodes (i.e. proteins without measurements) and has a built-in mechanism to impute missing data. The robustness of our approach was tested on simulated data. We applied DEPNs to reconstruct the ERBB signaling network in de novo trastuzumab-resistant human breast cancer cells, where protein expression was monitored on Reverse Phase Protein Arrays (RPPAs) after knockdown of network proteins using RNAi. CONCLUSION: DEPNs offer a robust, efficient and simple approach to infer protein signaling networks from multiple interventions. The method as well as the data have been made part of the latest version of the R package "nem", available as a supplement to this paper and via the Bioconductor repository